On the Margin Explanation of Boosting Algorithms
Authors
Abstract
Much attention has been paid to the theoretical explanation of the empirical success of AdaBoost. The most influential work is the margin theory, which is essentially an upper bound on the generalization error of any voting classifier in terms of the margin distribution over the training data. However, Breiman raised important questions about the margin explanation by developing a boosting algorithm, arc-gv, that provably generates a larger minimum margin than AdaBoost. He also gave a sharper bound in terms of the minimum margin and argued that the minimum margin governs generalization. In experiments, however, arc-gv usually performs worse than AdaBoost, casting serious doubt on the margin explanation. In this paper, we try to give a complete answer to Breiman's critique by proving a bound in terms of a new margin measure called the Equilibrium margin (Emargin). The Emargin bound is uniformly sharper than Breiman's minimum margin bound. This result suggests that the minimum margin is not crucial for the generalization error. We also show that a large Emargin implies good generalization. Experimental results on benchmark datasets demonstrate that AdaBoost usually has a larger Emargin and a smaller test error than arc-gv, which agrees well with our theory.
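To make the quantities in the margin theory concrete, the following sketch computes the margin distribution and the minimum margin of an AdaBoost voting classifier. It is a minimal illustration, not code from the paper: it assumes scikit-learn's AdaBoostClassifier with its default decision-stump base learner and a synthetic binary dataset; all variable names are ours.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier

# Synthetic binary problem; labels mapped from {0, 1} to {-1, +1}.
X, y01 = make_classification(n_samples=300, n_features=20, random_state=0)
y = 2 * y01 - 1

# Discrete AdaBoost with decision stumps (scikit-learn's default base learner).
clf = AdaBoostClassifier(n_estimators=100, algorithm="SAMME", random_state=0).fit(X, y)

# Normalized vote f(x) = sum_t alpha_t h_t(x) / sum_t alpha_t, so f(x) lies in [-1, +1].
alphas = clf.estimator_weights_[: len(clf.estimators_)]
votes = np.array([h.predict(X) for h in clf.estimators_])  # rows of +/-1 predictions
f = alphas @ votes / alphas.sum()

margins = y * f  # the margin distribution over the training data
print("minimum margin:", margins.min())
print("fraction of examples with margin <= 0.1:", np.mean(margins <= 0.1))
```

The last printed quantity, the fraction of training examples whose margin falls below a threshold, is the kind of margin-distribution statistic that appears in the margin bounds discussed above; Breiman's bound, by contrast, depends only on the minimum margin.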
Related papers
Margin Distribution Controlled Boosting
Schapire’s margin theory provides a theoretical explanation of the success of boosting-type methods and shows that a good margin distribution (MD) over the training samples is essential for generalization. However, the statement that an MD is good is vague; consequently, many recently developed algorithms try to generate an MD that is good in their own sense in order to improve generalization. Unlike their indire...
On the generalization of soft margin algorithms
Generalization bounds depending on the margin of a classifier are a relatively recent development. They provide an explanation of the performance of state-of-the-art learning systems such as support vector machines (SVMs) [1] and AdaBoost [2]. The difficulty with these bounds has been either their lack of robustness or their looseness. The question of whether the generalization of a classifier ...
Non-Convex Boosting Overcomes Random Label Noise
The sensitivity of AdaBoost to random label noise is a well-studied problem. LogitBoost, BrownBoost and RobustBoost are boosting algorithms claimed to be less sensitive to noise than AdaBoost. We present the results of experiments evaluating these algorithms on both synthetic and real datasets. We compare the performance on each of the datasets when the labels are corrupted by different levels of i...
Boosting Based on a Smooth Margin
We study two boosting algorithms, Coordinate Ascent Boosting and Approximate Coordinate Ascent Boosting, which are explicitly designed to produce maximum margins. To derive these algorithms, we introduce a smooth approximation of the margin that one can maximize in order to produce a maximum margin classifier. Our first algorithm is simply coordinate ascent on this function, involving a line se...
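A smooth approximation of the margin of the kind referred to above can be thought of as a differentiable surrogate for the minimum margin. The sketch below shows one common such softening, a log-sum-exp lower bound on the minimum normalized margin; it is an illustration under our own naming assumptions, not the authors' implementation.

```python
import numpy as np

def smooth_min_margin(margins, norm):
    """Differentiable lower bound on min(margins) / norm.

    `margins` is assumed to be the vector of unnormalized margins
    y_i * sum_t a_t h_t(x_i), and `norm` the L1 norm of the vote weights a_t.
    Since -log(sum_i exp(-m_i)) lies between min_i m_i - log(n) and min_i m_i,
    maximizing this quantity pushes the minimum normalized margin up.
    """
    return -np.log(np.sum(np.exp(-margins))) / norm

# Toy check: the smooth value lower-bounds the true minimum normalized margin.
m = np.array([4.0, 5.0, 7.0])
print(smooth_min_margin(m, norm=10.0), m.min() / 10.0)
```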
A Refined Margin Analysis for Boosting Algorithms via Equilibrium Margin
Much attention has been paid to the theoretical explanation of the empirical success of AdaBoost. The most influential work is the margin theory, which is essentially an upper bound for the generalization error of any voting classifier in terms of the margin distribution over the training data. However, important questions were raised about the margin explanation. Breiman (1999) proved a bound ...